AITopics | language adapter

Collaborating Authors

language adapter

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Analysis of AdvFusion: Adapter-based Multilingual Learning for Code Large Language Models

Esmaeili, Amirreza, Seddik, Fahd, Ji, Yongyi, Fard, Fatemeh, Chen, Fuxiang

arXiv.org Artificial IntelligenceNov-6-2025

Programming languages can benefit from one another by utilizing a language model for software engineering tasks. Full fine-tuning and Parameter Efficient Fine-Tuning (PEFT) of Code Language Models (Code-LMs) has been explored for multilingual knowledge transfer. AdapterFusion is a PEFT architecture that aims to enhance task performance by leveraging information from multiple programming languages, but primarily focuses on the target programming language. In our previous work, we proposed AdvFusion, a novel PEFT-based approach that effectively learns from other programming languages before adapting to the target task. Though previous experiments showed that AdvFusion outperformed AdapterFusion and LoRA, it was applied on pre-trained Code-LMs and was limited to only two tasks, code summarization and method name prediction. In this study, we expanded our work and investigated AdvFusion on Code Large Language Models (Code-LLMs), considering three new tasks: code generation, code translation, and commit message generation. We observed that different Code-LLMs/tasks exhibit different characteristics. In code generation, AdvFusion outperformed AdapterFusion but not other PEFT methods (LoRA, Compacter, and TaskAdapter). In commit message generation, AdapterFusion performed better than AdvFusion, and contrary to code generation, we found that the other PEFT methods do not have better performance. In code translation, AdvFusion performed worse than AdapterFusion overall, with the performance gap marginally widening as the model size increases. However, consistent with code generation, other PEFT methods showed better performance.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.02869

Country:

North America > United States (0.46)
North America > Canada (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Adapting Language Models to Indonesian Local Languages: An Empirical Study of Language Transferability on Zero-Shot Settings

Putri, Rifki Afina

arXiv.org Artificial IntelligenceJul-3-2025

--In this paper, we investigate the transferability of pre-trained language models to low-resource Indonesian local languages through the task of sentiment analysis. We evaluate both zero-shot performance and adapter-based transfer on ten local languages using models of different types: a monolingual Indonesian BERT, multilingual models such as mBERT and XLM-R, and a modular adapter-based approach called MAD-X. T o better understand model behavior, we group the target languages into three categories: seen (included during pre-training), partially seen (not included but linguistically related to seen languages), and unseen (absent and unrelated in pre-training data). Our results reveal clear performance disparities across these groups: multilingual models perform best on seen languages, moderately on partially seen ones, and poorly on unseen languages. We find that MAD-X significantly improves performance, especially for seen and partially seen languages, without requiring labeled data in the target language. Additionally, we conduct a further analysis on tokenization and show that while subword fragmentation and vocabulary overlap with Indonesian correlate weakly with prediction quality, they do not fully explain the observed performance. Instead, the most consistent predictor of transfer success is the model's prior exposure to the language, either directly or through a related language.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.01645

Country: Asia > Indonesia (0.05)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.65)

Add feedback

Limited-Resource Adapters Are Regularizers, Not Linguists

Fekete, Marcell, Robinson, Nathaniel R., Lavrinovics, Ernests, Jean-Baptiste, E. Djeride, Dabre, Raj, Bjerva, Johannes, Lent, Heather

arXiv.org Artificial IntelligenceJun-2-2025

Cross-lingual transfer from related high-resource languages is a well-established strategy to enhance low-resource language technologies. Prior work has shown that adapters show promise for, e.g., improving low-resource machine translation (MT). In this work, we investigate an adapter souping method combined with cross-attention fine-tuning of a pre-trained MT model to leverage language transfer for three low-resource Creole languages, which exhibit relatedness to different language groups across distinct linguistic dimensions. Our approach improves performance substantially over baselines. However, we find that linguistic relatedness -- or even a lack thereof -- does not covary meaningfully with adapter performance. Surprisingly, our cross-attention fine-tuning approach appears equally effective with randomly initialized adapters, implying that the benefit of adapters in this setting lies in parameter regularization, and not in meaningful information transfer. We provide analysis supporting this regularization hypothesis. Our findings underscore the reality that neural language processing involves many success factors, and that not all neural methods leverage linguistic knowledge in intuitive ways.

artificial intelligence, computational linguistic, natural language, (14 more...)

arXiv.org Artificial Intelligence

2505.24525

Country:

North America > United States (1.00)
Europe (1.00)
Africa (0.68)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting > Online (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

How to Tune a Multilingual Encoder Model for Germanic Languages: A Study of PEFT, Full Fine-Tuning, and Language Adapters

Oji, Romina, Kunz, Jenny

arXiv.org Artificial IntelligenceJan-10-2025

This paper investigates the optimal use of the multilingual encoder model mDeBERTa for tasks in three Germanic languages -- German, Swedish, and Icelandic -- representing varying levels of presence and likely data quality in mDeBERTas pre-training data. We compare full fine-tuning with the parameter-efficient fine-tuning (PEFT) methods LoRA and Pfeiffer bottleneck adapters, finding that PEFT is more effective for the higher-resource language, German. However, results for Swedish and Icelandic are less consistent. We also observe differences between tasks: While PEFT tends to work better for question answering, full fine-tuning is preferable for named entity recognition. Inspired by previous research on modular approaches that combine task and language adapters, we evaluate the impact of adding PEFT modules trained on unstructured text, finding that this approach is not beneficial.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.06025

Country:

Europe > Sweden (0.46)
North America > United States > Minnesota (0.28)
Asia > Middle East (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing

Kang, Deokhyung, Hwang, Seonjeong, Kim, Yunsu, Lee, Gary Geunbae

arXiv.org Artificial IntelligenceOct-1-2024

Recent efforts have aimed to utilize multilingual pretrained language models (mPLMs) to extend semantic parsing (SP) across multiple languages without requiring extensive annotations. However, achieving zero-shot cross-lingual transfer for SP remains challenging, leading to a performance gap between source and target languages. In this study, we propose Cross-Lingual Back-Parsing (CBP), a novel data augmentation methodology designed to enhance cross-lingual transfer for SP. Leveraging the representation geometry of the mPLMs, CBP synthesizes target language utterances from source meaning representations. Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings, by utilizing only labeled data in the source language and monolingual corpora. Extensive experiments on two cross-language SP benchmarks (Mschema2QA and Xspider) demonstrate that CBP brings substantial gains in the target language. Further analysis of the synthesized utterances shows that our method successfully generates target language utterances with high slot value alignment rates while preserving semantic integrity. Our codes and data are publicly available at https://github.com/deokhk/CBP.

representation, target language, utterance, (15 more...)

arXiv.org Artificial Intelligence

2410.00513

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Soft Language Prompts for Language Transfer

Vykopal, Ivan, Ostermann, Simon, Šimko, Marián

arXiv.org Artificial IntelligenceJul-2-2024

Cross-lingual knowledge transfer, especially between high- and low-resource languages, remains a challenge in natural language processing (NLP). This study offers insights for improving cross-lingual NLP applications through the combination of parameter-efficient fine-tuning methods. We systematically explore strategies for enhancing this cross-lingual transfer through the incorporation of language-specific and task-specific adapters and soft prompts. We present a detailed investigation of various combinations of these methods, exploring their efficiency across six languages, focusing on three low-resource languages, including the to our knowledge first use of soft language prompts. Our findings demonstrate that in contrast to claims of previous work, a combination of language and task adapters does not always work best; instead, combining a soft language prompt with a task adapter outperforms other configurations in many cases.

adapter, configuration, representation, (16 more...)

arXiv.org Artificial Intelligence

2407.02317

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)

Add feedback

Language Portability Strategies for Open-domain Dialogue with Pre-trained Language Models from High to Low Resource Languages

Njifenjou, Ahmed, Sucal, Virgile, Jabaian, Bassam, Lefèvre, Fabrice

arXiv.org Artificial IntelligenceJul-1-2024

In this paper we propose a study of linguistic portability strategies of large pre-trained language models (PLMs) used for open-domain dialogue systems in a high-resource language for this task. In particular the target low-resource language (L_T) will be simulated with French, as it lacks of task-specific resources and allows our human evaluation, when the source language (L_S) is English. For obvious reasons, recent works using such models for open-domain dialogue are mostly developed in English. Yet building specific PLMs for each possible target language supposes collecting new datasets and is costly. For this reason, trying to leverage all existing resources (PLMs and data) in both L_S and L_T , we wish to assess the performance achievable in L_T with different approaches. The first two approaches evaluate the usage of Neural Machine Translation (NMT) at different levels: TrainOnTarget where a L_S dataset is translated before fine-tuning in L_T and TestOnSource where a L_S model is coupled with NMT modules during inference. Then, the advent of BLOOM [2], the world first open-access multilingual large PLM, allow researchers to develop new approaches aiming to leverage not only the model's full accessibility but also its multilingualism and translation abilities. In this context the task is learned in L_S first and adapted to L_T using the MAD-X Adapter architecture [16]. In the two sets of experiments models are evaluated in spoken dialogue conditions with human and the strategies can be compared in terms of perceived interaction quality.

adapter, dataset, dialogue, (17 more...)

arXiv.org Artificial Intelligence

2407.01315

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)

Add feedback

Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters

Gurgurov, Daniil, Hartmann, Mareike, Ostermann, Simon

arXiv.org Artificial IntelligenceJul-1-2024

This paper explores the integration of graph knowledge from linguistic ontologies into multilingual Large Language Models (LLMs) using adapters to improve performance for low-resource languages (LRLs) in sentiment analysis (SA) and named entity recognition (NER). Building upon successful parameter-efficient fine-tuning techniques, such as K-ADAPTER and MAD-X, we propose a similar approach for incorporating knowledge from multilingual graphs, connecting concepts in various languages with each other through linguistic relationships, into multilingual LLMs for LRLs. Specifically, we focus on eight LRLs -- Maltese, Bulgarian, Indonesian, Nepali, Javanese, Uyghur, Tibetan, and Sinhala -- and employ language-specific adapters fine-tuned on data extracted from the language-specific section of ConceptNet, aiming to enable knowledge transfer across the languages covered by the knowledge graph. We compare various fine-tuning objectives, including standard Masked Language Modeling (MLM), MLM with full-word masking, and MLM with targeted masking, to analyse their effectiveness in learning and integrating the extracted graph data. Through empirical evaluation on language-specific tasks, we assess how structured graph knowledge affects the performance of multilingual LLMs for LRLs in SA and NER, providing insights into the potential benefits of adapting language models for low-resource scenarios.

adapter, knowledge, language adapter, (13 more...)

arXiv.org Artificial Intelligence

2407.01406

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saarland (0.04)
North America > Dominican Republic (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness

Zhang, Miaoran, Wang, Mingyang, Alabi, Jesujoba O., Klakow, Dietrich

arXiv.org Artificial IntelligenceJun-7-2024

This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages. The shared task aims at measuring the semantic textual relatedness between pairs of sentences, with a focus on a range of under-represented languages. In this work, we propose using machine translation for data augmentation to address the low-resource challenge of limited training data. Moreover, we apply task-adaptive pre-training on unlabeled task data to bridge the gap between pre-training and task adaptation. For model training, we investigate both full fine-tuning and adapter-based tuning, and adopt the adapter framework for effective zero-shot cross-lingual transfer. We achieve competitive results in the shared task: our system performs the best among all ranked teams in both subtask A (supervised learning) and subtask C (cross-lingual transfer).

computational linguistic, linguistic, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2404.0149

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)
Europe > Germany > Saxony > Leipzig (0.05)
(17 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

TartuNLP at EvaLatin 2024: Emotion Polarity Detection

Dorkin, Aleksei, Sirts, Kairit

arXiv.org Artificial IntelligenceMay-2-2024

This paper presents the TartuNLP team submission to EvaLatin 2024 shared task of the emotion polarity detection for historical Latin texts. Our system relies on two distinct approaches to annotating training data for supervised learning: 1) creating heuristics-based labels by adopting the polarity lexicon provided by the organizers and 2) generating labels with GPT4. We employed parameter efficient fine-tuning using the adapters framework and experimented with both monolingual and cross-lingual knowledge transfer for training language and task adapters. Our submission with the LLM-generated labels achieved the overall first place in the emotion polarity detection task. Our results show that LLM-based annotations show promising results on texts in Latin.

adapter, knowledge transfer, submission, (14 more...)

arXiv.org Artificial Intelligence

2405.01159

Country:

Europe > Estonia > Tartu County > Tartu (0.05)
Asia > China (0.05)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback